INTRODUCTION


Approximately  85%  of  today’s  weapon  systems  employ  embedded  computers,  and 
there  is  a  trend  towards  more  complex  systems.  In  these  mission  critical  computer 
resources  (MCCR),  a  software  error  can  have  a  drastic  effect  upon  system 
performance. 

There are several problems facing MCCR development. First, the hardware depends
ever more heavily upon the software to carry out its mission successfully. At the
same time, software development is a relatively new and unproven technology compared
to hardware. Software development is often performed in a sloppy fashion and is
improperly documented. Off-the-shelf software is often force-fitted to the application
at hand. This approach to software design leads to errors which, if detected late in
the development cycle, can drive up costs substantially.

The major problem with DoD software development projects today relates to the
verification and validation of requirements in the software program. Statistics have
shown that approximately 46% to 64% of software errors are traced back to inadequate
requirements and design. Of these errors, 70% are not caught early in the life cycle
and propagate into production and deployment (ref 1). The problem stems from the fact
that many software implementations of functional requirements remain untested, and
consequently errors are found during use when an untested path in the software is
executed. These surprises cost the government both time and money when extensive
debugging and reverification efforts are required to fix the problems. The cost of
correcting such an error can be as much as 300 times the cost of correcting it during
unit testing.

Software  assessments  for  the  most  part  are  subjective  in  nature.  The  difference 
between  hardware  and  software  quality  assessments  is  the  lack  of  measurable 
parameters  for  software.  For  this  reason  there  is  an  increasing  drive  to  develop  and 
apply  techniques  which  provide  a  quantitative  means  of  measuring  or  assessing 
software  quality. 

The primary means of performing software quality assessments is through independent
verification and validation (IV&V). This includes verifying that requirements are met
through qualitative specification reviews, checks for adequate documentation, and
testing. Testing in support of software quality assurance (SQA) is very labor intensive.
In fact, the norm for software testing is about 50% of the total software development
effort (ref 1). In order to meet program deadlines and cost constraints, the testing
effort is often cut short, leaving doubt as to the quality of the software.

The problems mentioned above were addressed by the Software Quality Assurance/Math
Branch of the U.S. Army Armament, Munitions and Chemical Command (AMCCOM). A technique
was devised that would help verify the quality of the software through the use of
quantitative measures. The assessment technique, in the form of an automated tool,
would increase the productivity and efficiency of available manpower, reduce
subjectivity, reduce fielded system failures, and consequently reduce development and
maintenance costs. The result of this effort was the automation of the cyclomatic
complexity metric.


THE  CYCLOMATIC  COMPLEXITY  METRIC 

The cyclomatic complexity metric is documented in the U.S. Department of Commerce
National Bureau of Standards (NBS) Special Publication 500-99, "Structured Testing: A
Software Testing Methodology Using the Cyclomatic Complexity Metric" (ref 2). It is
based upon structured programming conventions and graph theory. The idea behind the
metric is to measure, quantify, or evaluate the complexity of a software module.
Software which is less complex is easier to comprehend and maintain, can be tested
thoroughly, and is less likely to have embedded errors. The ideal limit of complexity
is 10 for any software module. A module of complexity greater than 10 would need to be
broken down into smaller submodules.

The concept behind the metric is simple: one counts the number of control tokens
which exist in the software module to determine complexity. Complexity can be
calculated as:


Complexity  =  The  number  of  control  tokens  +  1 

Control tokens are programming language statements which provide decision points that
modify the top-down flow of the program. In other words, statements such as
IF-THEN-ELSE, CASE, and GOTO are considered to be control tokens since they base
program flow upon a logical decision, thereby creating alternate paths which program
execution may follow. At the same time, the technique also identifies the critical
paths needed to exercise every line of code in the module. A module of complexity five
would have five critical, or basis, paths. These paths can then be used to adequately
test software modules while minimizing the extent of testing required.
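
The token-counting rule can be illustrated with a minimal Python sketch. It is not CAT's parser: the keyword set, the handling of "end if" and "when others", and the DEMO procedure are all assumptions made for illustration, and a real tool would need a full language parser.

```python
import re

# Assumed decision keywords for an Ada-like language. CAT's real parser
# tables are not described in the text, so this set is illustrative only.
DECISION_KEYWORDS = {"if", "elsif", "while", "for", "when"}


def cyclomatic_complexity(source: str) -> int:
    """Return (number of control tokens) + 1 for one module's source text."""
    # Drop comments and string literals so keywords inside them are ignored.
    source = re.sub(r"--.*", "", source)
    source = re.sub(r'"[^"\n]*"', "", source)
    words = re.findall(r"[a-z_]+", source.lower())

    tokens = 0
    for i, word in enumerate(words):
        if word not in DECISION_KEYWORDS:
            continue
        if i > 0 and words[i - 1] == "end":
            continue  # "end if" closes a block; it is not a new decision
        if word == "when" and words[i + 1 : i + 2] == ["others"]:
            continue  # the default alternative adds no extra path
        tokens += 1
    return tokens + 1


# Hypothetical module used only to exercise the counter.
SAMPLE = """
procedure DEMO is
begin
   if X = 0 then
      put_line("X is zero");
   else
      while X /= 0 loop
         for I in 1 .. 20 loop
            put(I);
         end loop;
         case X is
            when 1 => STEP_A;
            when 2 => STEP_B;
            when others => STEP_C;
         end case;
      end loop;
   end if;
end DEMO;
"""

# if + while + for + two non-default case alternatives = 5 tokens, plus 1
print(cyclomatic_complexity(SAMPLE))  # prints 6
```

A three-way CASE is counted here as two control tokens, since the default alternative does not introduce an additional decision.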

The metric can be used throughout the entire software life cycle. Imposing a
complexity limit as a contractual requirement will force structured programming
techniques. Applied during software development, the metric will limit the number of
basis paths in a program at the design and coding stages. It can be used during
software testing to identify the basis paths and to minimize the testing effort. During
the maintenance phase, a proposed change should not be allowed to substantially drive
up the complexity, thereby increasing the testing effort.

The  cyclomatic  complexity  metric  allows  you  to  quantitatively  assess  the  software. 
As  discussed  earlier,  having  this  metric  incorporated  in  the  form  of  an  automated  tool 
would  have  substantial  benefits  as  well. 


COMPLEXITY  ANALYSIS  TOOL 

The complexity analysis tool (CAT) is an automated tool which is based upon the
cyclomatic complexity metric. It is designed to run on an IBM PC AT under an MS-DOS
operating system, as well as on a DEC VAX under a UNIX (AT&T 5.2) operating system.
CAT will analyze an ASCII source code file and will identify the various control
tokens. It will then generate a graphic representation of the logic flow paths, called
a data flow diagram (DFD), similar to a flow chart. The basis paths can also be
displayed. CAT’s interface consists of a series of user-friendly menus which guide the
user through execution of the tool.

The  first  thing  CAT  will  ask  for  is  the  language  being  analyzed.  CAT  currently  has 
the  ability  to  analyze  programs  written  in  BASIC  (HP-71),  PDL  (Caine,  Farber,  Gordon), 
Equate  ATLAS,  Ada  (DOD-STD-1815A),  and  Ada  PDL.  After  selecting  an  appropriate 
file,  the  tool  will  pass  the  file  through  the  appropriate  language  parser  and  perform  the 
metrics  analysis.  CAT  operates  under  the  assumption  that  the  file  can  be  compiled 
successfully.  If  not,  an  appropriate  error  message  will  be  raised.  Upon  successful 
completion of the parser and metrics analysis, CAT is ready to display its output. The
user, through the use of menus, selects where the information is to be sent (i.e., to
the screen, printer, plotter, or disk file). Examples are shown in figures 1 through 5.

CAT’s  output  consists  of  several  tables  and  diagrams,  including  the  source  listing, 
data  flow  diagram,  and  test  paths.  This  output  will  provide  the  developer  or  assessor  a 
pictorial  and  quantitative  representation  of  the  software  logic.  The  first  output  consists 
of  a  source  listing  of  the  entire  file.  This  listing  contains  two  major  sections.  The  first  is 
the  module  directory  (fig.  1).  It  lists  the  modules  found  in  this  file  and  presents  the  vital 
statistics  for  each  of  the  modules.  Modules  are  listed  in  the  order  they  were  found  in  the 
source  file,  i.e.,  by  ascending  line  number.  The  directory  shows  the  name  of  each 
module,  the  starting  line,  the  number  of  lines  in  the  module,  and  cyclomatic  complexity. 
At the far left of each line in the directory, a letter is given to the module. This
letter is used in the body of the listing to identify code belonging to the module.

The second section is the listing itself (fig. 2). The leftmost columns of each line
are used to show the correspondence between the source code and the DFD. First is the
line number, which is the count of lines from the top of the file. Next is the module
letter which identifies the module containing this line of code. Following the module
letter is the list of nodes that are represented, wholly or partially, by this line.
Each module may be examined individually if desired.

A summary of information appears at the top of the DFD (fig. 3). Listed are the source
filename, the module within the source file being analyzed, the module’s complexity,
the number of lines of code in that module, and the date and time of the analysis. Also
included is a color scheme for the DFD to indicate program flow direction.

The numbers on the DFD represent nodes; each node is a block of statements in which
the program flow is sequential. Edges represent the branches the program takes between
blocks.  Edges  that  cause  loops  are  shown  in  one  color.  These  edges  always  flow  from 
the  bottom  to  the  top  of  the  page.  Edges  that  perform  a  structured  exit  of  a  loop  (such 
as  a  WHILE  or  FOR  statement)  are  shown  in  another  color.  These  edges  flow  from  the 
top  to  the  bottom  of  the  page.  All  remaining  edges  are  drawn  in  a  third  color.  These 
edges  also  go  from  the  top  to  the  bottom  of  the  page.  Another  indication  of  the  direction 
and  function  of  an  edge  is  its  shape.  Loops  or  loop  exits  are  always  drawn  as  curved 
lines.  Other  edges  are  drawn  as  straight  lines  unless  they  must  be  curved  to  avoid 
colliding  with  another  node. 

If an analyzed file contains several modules, any module that has a complexity greater
than 10 or that has been changed is a candidate for further review. Through the menu
interface, a specific module can be selected for individual review, and the module’s
corresponding source listing, DFD, and basis paths can be examined. CAT can
automatically determine the basis paths and display them in two ways. The first is as a
series of numbered nodes, such as 0-1-2-3-5-6-7-14-7-8-9-10-13-3-4-16-17, corresponding
to the DFD (fig. 4). The second is by graphing the test paths individually (fig. 5).
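
As a cross-check on the numbered-node form, the following Python sketch rebuilds the flow graph from the six node sequences of figure 4 and applies the standard graph-theory formula for a single module, complexity = edges - nodes + 2. The formula is McCabe's and is not stated in this report; the helper names are hypothetical, and the sketch assumes the listed paths together traverse every edge of the graph.

```python
# Baseline and test paths as read from figure 4 (node numbers from the DFD).
BASIS_PATHS = [
    "0-1-15-16-17",
    "0-1-2-3-4-16-17",
    "0-1-2-3-5-6-7-8-9-10-13-3-4-16-17",
    "0-1-2-3-5-6-7-14-7-8-9-10-13-3-4-16-17",
    "0-1-2-3-5-6-7-8-9-12-13-3-4-16-17",
    "0-1-2-3-5-6-7-8-9-11-13-3-4-16-17",
]


def parse_path(text):
    """Turn '0-1-15' into [0, 1, 15]."""
    return [int(node) for node in text.split("-")]


def graph_from_paths(paths):
    """Collect the distinct nodes and directed edges visited by the paths."""
    nodes, edges = set(), set()
    for text in paths:
        path = parse_path(text)
        nodes.update(path)
        edges.update(zip(path, path[1:]))  # consecutive node pairs
    return nodes, edges


nodes, edges = graph_from_paths(BASIS_PATHS)
complexity = len(edges) - len(nodes) + 2
print(len(nodes), len(edges), complexity)  # prints 18 22 6
```

The result agrees with the complexity of 6 reported for the module and with the number of paths listed.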


BENEFITS 

The significant benefits derived from a quality design are realized throughout the
full life cycle of the program (i.e., development, production, post-deployment), as
opposed to benefits derived from an instantaneous assessment. CAT provides an overall
improvement in design by enforcing structured programming techniques upon the software
programmer, thus designing quality into the software. Imposition of the metric would
also allow early inspection and diagnosis of the problem areas in the software logic.
For example, during initial design, CAT can be used to assure a low complexity in the
PDL. By using the PDL, DFDs, source listing, and test paths, one can perform a
walk-through to check for logic or function errors before coding takes place. When
identified problems are resolved, coding can proceed. The developed code can then be
passed through CAT again, producing another set of output. These two sets of output,
one from the PDL and one from the code, can then be compared against one another. There
should not be significant differences between the two. For example, if a PDL module had
a complexity of five, the code implementing that module should not have a corresponding
complexity of 25. The complexities are likely to increase when the PDL is implemented
in code, but the increase should not be significant. If there are differences such as
these, this indicates that the requirements in the PDL are not correctly implemented in
the code. This is another way errors can be uncovered early in the life cycle before
they grow to unmanageable proportions.
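
The PDL-to-code comparison can be sketched in Python as follows. The limit of 10 comes from the metric's guideline; the growth ratio of 2.0, the function name, and the module names are assumptions chosen for illustration, since the text does not define how large a "significant" difference is.

```python
def review_candidates(pdl, code, max_ratio=2.0, limit=10):
    """Flag modules whose code complexity is over the limit or has grown
    sharply relative to the PDL.

    pdl and code map module names to cyclomatic complexity values.
    max_ratio is an assumed threshold, not a value given in the report.
    """
    findings = {}
    for module, code_value in code.items():
        reasons = []
        if code_value > limit:
            reasons.append(f"complexity {code_value} exceeds limit {limit}")
        pdl_value = pdl.get(module)
        if pdl_value is not None and code_value > max_ratio * pdl_value:
            reasons.append(f"grew from {pdl_value} in PDL to {code_value} in code")
        if reasons:
            findings[module] = reasons
    return findings


# Hypothetical complexities for two modules.
pdl_complexity = {"INIT": 5, "UPDATE": 4}
code_complexity = {"INIT": 25, "UPDATE": 6}

for module, reasons in review_candidates(pdl_complexity, code_complexity).items():
    print(module, "->", "; ".join(reasons))
```

With these inputs only INIT is reported, for both reasons; UPDATE grows from 4 to 6, which stays under both thresholds.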

A  major  benefit  derived  from  the  use  of  the  metric  would  be  an  improvement  in  the 
ease  and  efficiency  of  testing.  By  concentrating  on  the  basis  paths,  the  testing  effort  is 
prioritized  and  minimized.  The  test  paths  as  a  whole  can  be  used  for  unit  testing  of  the 
module  and  can  be  part  of  its  unit  development  folder.  The  tool  can  also  be  used  in 
conjunction  with  acceptance  testing  as  a  means  of  verifying  the  performance  of  the 
software. 

From a post-deployment perspective, the metric is a means of obtaining a measure of
software supportability. Suppose you would like to know the effects of a proposed
change in the software. In figure 2 you could locate the lines of source code you
intend to change. By looking at the corresponding node labels on the left-hand side,
you could then go to figure 3 and note the effect upon the other nodes. If the area in
question is highly structured, a change would probably have little impact. However, if
the area is highly unstructured, a change would probably have a drastic impact. The
diagrams could also be used in the reverse fashion. For example, if you notice that a
particular area of the DFD is cluttered and you want to clean it up, you could note
which nodes are involved. You could then go to figure 2 and find the corresponding
source code you would have to change.

CAT,  because  of  its  data  flow  analysis,  not  only  detects  a  modification  but  relates 
the  modification  to  a  particular  flow  area  in  the  program.  To  fully  characterize  a 
program  change,  CAT  can  flag  changes  in  the  program  flow  paths.  Thus  the  complexity 
metric  will  provide  an  efficient  means  of  identifying  regression  test  cases  to  verify  those 
portions  of  the  software  program  which  are  changed. 
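
One way to picture regression test selection is the Python sketch below, which picks the basis paths that pass through any changed node. It uses the node sequences of figure 4; the function name and the selection rule are assumptions for illustration and are not a description of how CAT itself flags changes.

```python
# Basis paths from figure 4, keyed by the labels used in that listing.
BASIS_PATHS = {
    "Baseline":    [0, 1, 15, 16, 17],
    "Test Path 1": [0, 1, 2, 3, 4, 16, 17],
    "Test Path 2": [0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 13, 3, 4, 16, 17],
    "Test Path 3": [0, 1, 2, 3, 5, 6, 7, 14, 7, 8, 9, 10, 13, 3, 4, 16, 17],
    "Test Path 4": [0, 1, 2, 3, 5, 6, 7, 8, 9, 12, 13, 3, 4, 16, 17],
    "Test Path 5": [0, 1, 2, 3, 5, 6, 7, 8, 9, 11, 13, 3, 4, 16, 17],
}


def paths_to_rerun(paths, changed_nodes):
    """Return the names of the paths that visit at least one changed node."""
    changed = set(changed_nodes)
    return [name for name, nodes in paths.items() if changed & set(nodes)]


print(paths_to_rerun(BASIS_PATHS, {14}))  # ['Test Path 3']
print(paths_to_rerun(BASIS_PATHS, {15}))  # ['Baseline']
```

A change confined to node 14 would call for rerunning a single path, whereas a change to node 1, which lies on every path, would select all six.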

CAT as an automated tool can eliminate manual computation errors, eliminate
subjectivity, and significantly reduce the number of man-hours required for manual
computations of the metric. Preliminary forecasts estimate that the effort required to
apply the metric manually to a software development process is approximately 3% to 7%
of the overall development effort (assuming no learning curve is required). This
estimate is based on our own in-house experience with MCCR software using manual
calculations of complexity. We project that with the automated tool the effort would
drop by about one or two orders of magnitude. In either case, taking the extra effort
to use the metric provides a vehicle for detecting and correcting bugs earlier in the
life cycle. This in turn could amount to significant benefits and savings in terms of
development and testing costs, productivity gains, and software quality and reliability.


CONCLUSIONS 

The  use  of  CAT  as  a  quality  assurance  tool  provides  a  quantitative  measure  of 
software  quality,  structure,  robustness,  testability,  and  maintainability.  Imposition  of  the 
cyclomatic  complexity  metric  on  the  developer  during  the  early  phases  of  the  life  cycle 
will  result  in  a  quality  design.  Benefits  will  then  be  realized  throughout  the  life  cycle  of 
the  program. 

The  complexity  metric  has  been  incorporated  as  a  requirement  in  several  mission 
critical  computer  resources  programs  currently  underway.  A  follow-on  study  will  be 
performed  which  will  quantify  these  benefits  in  terms  of  time,  labor,  and  cost  savings,  as 
well  as  correlations  between  complexity  and  reliability,  error  rate,  modifiability,  etc.  The 
study  will  be  performed  using  data  gathered  from  users  of  the  metric  and  tool.  The 
complexity  analysis  tool  can  be  obtained  by  contacting: 

U.S. Army AMCCOM
AMSMC-QAH-A(D)
Bldg 62
Picatinny Arsenal, NJ 07806-5000
(201) 724-4849, Autovon 880-4849

Complexity Analysis Tool
Listing from source file xsample.ada

Module   Module       Cyclomatic   Starting   Number
letter   name         complexity   line       of lines

A        SAMPLE_ADA   6            6

Figure 1. Module directory

CAT listing

Source file: xsample.ada

Line  Module/Node  Source Text

   1               procedure SAMPLE_ADA is
   2                 FIRST: INTEGER;
   3                 NEXT: INTEGER;
   4                 C: CHARACTER;
   5
   6  A0           begin
   7  A1             if FIRST = 0 then
   8  A15              put_line("First equal to zero
   9  A2             else
  10  A2               NEXT := FIRST;
  11  A3               while (NEXT /= 0) loop
  12  A5 A6              put_line("Next not equ
  13  A7                 for I in 1..20 loop
  14  A14                  C := [illegible];
  15  A14                  put(C);
  16  A8 A14             end loop;
  17  A9                 case C is
  18  A12                  when 1 => ADD;
  19  A11                  when 2 => [illegible];
  20  A10                  when others => PRINT
  21  A13                end case;
  22  A4 A13           end loop;
  23  A16            end if;
  24  A17          end SAMPLE_ADA;

Figure 2. Module listing

Test Path Listing

Module Name: SAMPLE_ADA                 File: xsample.ada
Complexity:  6                          Date/Time: Mon Sep 26 09:38
Language:    Ada                        Page 1

Baseline:     0 1 15 16 17
Test Path 1:  0 1 2 3 4 16 17
Test Path 2:  0 1 2 3 5 6 7 8 9 10 13 3 4 16 17
Test Path 3:  0 1 2 3 5 6 7 14 7 8 9 10 13 3 4 16 17
Test Path 4:  0 1 2 3 5 6 7 8 9 12 13 3 4 16 17
Test Path 5:  0 1 2 3 5 6 7 8 9 11 13 3 4 16 17

Figure 4. Test path listing

DFD legend: Upward flows, Loop exits, Plain edges (Mon Sep 26 08:23)